Hierarchical Classification using Shrunken Centroids
Abstract
Many classifiers can be trained on gene expression data with class labels. A number of them have an embedded feature-selection mechanism that identifies a subset of significant genes to be used for future prediction. With more than two class labels, and especially with a dozen or more, it is useful to know the relative affinity among the classes and which subsets of genes discriminate between different groups of classes. This information helps both in analyzing the relationships among classes and in predicting future instances. We provide it through a hierarchical adaptation of the nearest shrunken centroid classifier, and we demonstrate the new method on a cancer data example.
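The abstract does not spell out the base classifier, so the following is only a rough sketch of the nearest shrunken centroid idea that the hierarchical method builds on. It assumes a simplified variant: class centroids are soft-thresholded toward the overall centroid and samples go to the nearest one by squared Euclidean distance, without the per-gene standardization and class priors used in the published classifier. The function names and the threshold `delta` are hypothetical, and the hierarchical step itself is not shown because the abstract gives no details of it.

```python
def soft_threshold(value, delta):
    """Shrink value toward zero by delta; anything within delta of zero becomes 0."""
    magnitude = abs(value) - delta
    if magnitude <= 0:
        return 0.0
    return magnitude if value > 0 else -magnitude


def shrunken_centroids(samples, labels, delta):
    """Return ({class: shrunken centroid}, overall centroid).

    Simplifying assumption: no per-gene standardization, no class priors.
    """
    n_genes = len(samples[0])
    overall = [sum(row[j] for row in samples) / len(samples)
               for j in range(n_genes)]
    centroids = {}
    for cls in sorted(set(labels)):
        rows = [row for row, lab in zip(samples, labels) if lab == cls]
        class_mean = [sum(r[j] for r in rows) / len(rows)
                      for j in range(n_genes)]
        # Shrink each class-vs-overall difference, then add the overall mean back.
        centroids[cls] = [
            overall[j] + soft_threshold(class_mean[j] - overall[j], delta)
            for j in range(n_genes)
        ]
    return centroids, overall


def selected_genes(centroids, overall):
    """Genes whose shrunken centroid still differs from the overall centroid."""
    return [j for j in range(len(overall))
            if any(c[j] != overall[j] for c in centroids.values())]


def predict(sample, centroids):
    """Assign the sample to the class with the nearest shrunken centroid."""
    def sq_dist(centroid):
        return sum((x - y) ** 2 for x, y in zip(sample, centroid))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


# Toy data: gene 0 separates the classes, gene 1 is nearly constant.
samples = [[5.0, 1.0], [6.0, 1.2], [1.0, 1.1], [0.0, 0.9]]
labels = ["A", "A", "B", "B"]
centroids, overall = shrunken_centroids(samples, labels, delta=1.0)
kept = selected_genes(centroids, overall)
guess = predict([4.0, 1.0], centroids)
```

In this toy run, the nearly constant gene is shrunk all the way to the overall centroid and drops out, which is the embedded feature selection the abstract refers to. A larger `delta` removes more genes.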
Similar resources
Nearest Shrunken Centroid as Feature Selection of Microarray Data
The nearest shrunken centroid classifier uses shrunken centroids as prototypes for each class, and each test sample is assigned to the class whose shrunken centroid is nearest to it. In our study, the nearest shrunken centroid classifier was used simply to select important genes prior to classification. Random Forest, a decision-tree-based classification algorithm, is chosen as a classi...
Nearest shrunken centroids via alternative genewise shrinkages
Nearest shrunken centroids (NSC) is a popular classification method for microarray data. NSC calculates centroids for each class and "shrinks" the centroids toward 0 using soft thresholding. Future observations are then assigned to the class with the minimum distance between the observation and the (shrunken) centroid. Under certain conditions the soft shrinkage used by NSC is equivalent to a L...
Improved centroids estimation for the nearest shrunken centroid classifier
Motivation: The nearest shrunken centroid (NSC) method has been successfully applied in many DNA-microarray classification problems. The NSC uses 'shrunken' centroids as prototypes for each class and identifies subsets of genes that best characterize each class. Classification is then made to the nearest (shrunken) centroid. The NSC is very easy to implement and very easy to interpret, however, ...
Context Aware Group Nearest Shrunken Centroids in Large-Scale Genomic Studies
Recent genomic studies have identified genes related to specific phenotypes. In addition to marginal association analysis for individual genes, analyzing gene pathways (functionally related sets of genes) may yield additional valuable insights. We have devised an approach to phenotype classification from gene expression profiling. Our method named “group Nearest Shrunken Centroids (gNS...
Diagnosis of multiple cancer types by shrunken centroids of gene expression.
We have devised an approach to cancer class prediction from gene expression profiling, based on an enhancement of the simple nearest prototype (centroid) classifier. We shrink the prototypes and hence obtain a classifier that is often more accurate than competing methods. Our method of "nearest shrunken centroids" identifies subsets of genes that best characterize each class. The technique is g...